A Case for Standard Non-blocking Collective Operations

نویسندگان

  • Torsten Hoefler
  • Prabhanjan Kambadur
  • Richard L. Graham
  • Galen M. Shipman
  • Andrew Lumsdaine
چکیده

In this paper we make the case for adding standard nonblocking collective operations to the MPI standard. The non-blocking point-to-point and blocking collective operations currently defined by MPI provide important performance and abstraction benefits. To allow these benefits to be simultaneously realized, we present an application programming interface for non-blocking collective operations in MPI. Microbenchmark and application-based performance results demonstrate that non-blocking collective operations offer not only improved convenience, but improved performance as well, when compared to manual use of threads with blocking collectives.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

D RA FT 10 / 1 4 / 20 08 Non - Blocking Collective Operations for MPI - 3 The MPI - 3 Collective Operations

We propose new non-blocking interfaces for the collective group communication functions defined in MPI1 and MPI-2. This document is meant as a standard extension and written in the same way as the MPI standards. It covers the MPI-API as well as the semantics of the new operations.

متن کامل

A Case for Non-blocking Collective Operations

Non-blocking collective operations for MPI have been in discussion for a long time. We want to contribute to this discussion and to give a rationale for the usage these operations and assess their possible benefits. A LogGP model for the CPU overhead of collective algorithms and a benchmark to measures it are provided and show a large potential to overlap communication and computation. We show ...

متن کامل

MPI collectives at scale

Collective operations improve the performance and reduce code complexity of many applications parallelized with the messagepassing interface (MPI) paradigm. In this article, we will investigate the impact of load imbalance on the performance of collective operations and possibility for hiding parallel overhead caused by a collective communication pattern, by overlapping the communication with c...

متن کامل

Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations

This paper presents a case study about the applicability and usage of non blocking collective operations. These operations provide the ability to overlap communication with computation and to avoid unnecessary synchronization. We introduce our NBC library, a portable low-overhead implementation of non blocking collectives on top of MPI-1. We demonstrate the easy usage of the NBC library with th...

متن کامل

Implications of application usage characteristics for collective communication offload

The performance of collective communication operations is known to have a significant impact on the scalability of some applications. Indeed, the global, synchronous nature of some collective operations directly implies that they will become the bottleneck when scaling to hundreds of thousands of nodes. This fact has led many researchers to try to improve the efficiency of collective operations...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007